Using AdaBoost with SHAP

🔖 interpretability
🔖 machine learning
Author

Guangyao Zhao

Published

Jan 7, 2023

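SHAP is a very handy tool for interpretable machine learning, but its tree explainer does not support AdaBoost natively. To work around this, you can add a few lines to the SHAP source code yourself: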

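Where to add it: /Users/wenv/anaconda3/lib/python3.9/site-packages/shap/explainers/_tree.py, at around line 1300, alongside the existing `elif safe_isinstance(...)` branches. Replace this path with the location of your own installation.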
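The path above is specific to one machine. One way to find the matching file in your own environment is to ask Python's import system, as in the minimal sketch below. The helper name `find_module_file` is made up for this illustration and is not part of SHAP.

```python
import importlib.util
from typing import Optional


def find_module_file(module_name: str) -> Optional[str]:
    """Return the source file path of an importable module, or None.

    Hypothetical helper for this post; it only wraps importlib.
    """
    try:
        # For a dotted name, find_spec imports the parent packages first.
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        # Raised when a parent package (e.g. `shap` itself) is missing.
        return None
    return spec.origin if spec is not None else None
```

Calling `find_module_file("shap.explainers._tree")` in the environment where SHAP is installed should return the path of the file to edit. Note that the line number of the `elif` chain will probably differ between SHAP versions.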

# TODO: support for AdaBoost, added by hand (not part of upstream SHAP)
elif safe_isinstance(
        model,
    ("sklearn.ensemble.AdaBoostClassifier",
        "sklearn.ensemble._weight_boosting.AdaBoostClassifier",
        "imblearn.ensemble.RUSBoostClassifier",
        "imblearn.ensemble._weight_boosting.RUSBoostClassifier")):
    assert hasattr(
        model, "estimators_"
    ), "Model has no `estimators_`! Have you called `model.fit`?"
    self.internal_dtype = model.estimators_[0].tree_.value.dtype.type
    self.input_dtype = np.float32
    self.trees = [
        SingleTree(e.tree_,
                    normalize=True,
                    scaling=weight,
                    data=data,
                    data_missing=data_missing) for e, weight in zip(
                        model.estimators_, model.estimator_weights_ /
                        sum(model.estimator_weights_))
    ]
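    # Split criterion of the fitted trees, for example "gini". It is read from
    # the first fitted estimator because `base_estimator_` was renamed to
    # `estimator_` in newer scikit-learn releases.
    self.objective = objective_name_map.get(
        model.estimators_[0].criterion, None
    )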
    self.tree_output = "probability"
elif safe_isinstance(
        model,
    ("sklearn.ensemble._weight_boosting.AdaBoostRegressor",
        "sklearn.ensemble.AdaBoostRegressor")):
    assert hasattr(
        model, "estimators_"
    ), "Model has no `estimators_`! Have you called `model.fit`?"
    self.internal_dtype = model.estimators_[0].tree_.value.dtype.type
    self.input_dtype = np.float32
    self.trees = [
        SingleTree(e.tree_,
                    scaling=weight,
                    data=data,
                    data_missing=data_missing) for e, weight in zip(
                        model.estimators_, model.estimator_weights_ /
                        sum(model.estimator_weights_))
    ]
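    # Split criterion of the fitted trees, for example "squared_error". It is
    # read from the first fitted estimator because `base_estimator_` was
    # renamed to `estimator_` in newer scikit-learn releases.
    self.objective = objective_name_map.get(
        model.estimators_[0].criterion, None
    )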
    self.tree_output = "raw_value"
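
In both branches, each tree is given `scaling=weight`, where the weight is its entry in `model.estimator_weights_` divided by the sum of all weights. The explainer therefore treats the ensemble as a weighted average of the individual trees. The pure-Python sketch below shows only that arithmetic. The function names are hypothetical and do not appear in SHAP or scikit-learn.

```python
from typing import List, Sequence


def normalized_weights(estimator_weights: Sequence[float]) -> List[float]:
    """Rescale the estimator weights so that they sum to 1."""
    total = sum(estimator_weights)
    if total == 0:
        raise ValueError("estimator weights sum to zero")
    return [w / total for w in estimator_weights]


def weighted_average(tree_outputs: Sequence[float],
                     estimator_weights: Sequence[float]) -> float:
    """Combine per-tree outputs the way the patch's `scaling` implies."""
    scales = normalized_weights(estimator_weights)
    return sum(s * o for s, o in zip(scales, tree_outputs))
```

For example, two trees with weights 1.0 and 3.0 receive scalings of 0.25 and 0.75. This weighted average is what the patched explainer attributes. As far as I can tell, scikit-learn's own AdaBoost prediction is not always computed this way (the regressor appears to use a weighted median), so it is worth comparing the sum of the SHAP values plus the expected value against `model.predict` or `model.predict_proba` before relying on the explanations.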